Text-Independent Speaker Verification for Real Fast-Varying Noisy Environments
Abstract
This work investigates speaker verification in real-world noisy environments, comparing a novel feature extraction process designed to suppress time-varying noise with a fine-tuned spectral subtraction method. The proposed process approximates the spectral magnitudes of the clean speech and of the noise with mixtures of Gaussian probability density functions (pdfs), fitted with the Expectation-Maximization (EM) algorithm. The Bayesian inference framework is then applied to the degraded spectral coefficients, and Minimum Mean Square Error (MMSE) estimation yields a closed-form solution for the spectral magnitude estimate. Finally, the estimated spectral magnitude is incorporated into the Mel-Frequency Cepstral Coefficient (MFCC) front-end of a baseline text-independent speaker verification system based on Probabilistic Neural Networks, which participated successfully in the 2002 NIST (National Institute of Standards and Technology, USA) Speaker Recognition Evaluation. A comparative study on real-world noise types shows a significant performance gain over both the baseline speech features and the spectral subtraction enhancement method. In a passing-by aircraft scenario, absolute speaker verification performance improved by more than 27% at 0 dB signal-to-noise ratio (SNR) compared to the MFCCs, and by more than 13% at -5 dB SNR compared to the spectral subtraction version.
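The abstract does not give the estimator itself, so the following is only a minimal sketch of the general idea, under assumptions that are not taken from the paper: the noisy magnitude is modelled as y = x + n in a single frequency bin, the clean-speech and noise mixtures are one-dimensional and already fitted (for example by EM), and the non-negativity of magnitudes is ignored. The function name `mmse_estimate` and all parameter names are hypothetical. Under these assumptions the conditional mean E[x | y] has a closed form: a posterior-weighted sum of per-component linear (Wiener-like) estimates.

```python
import numpy as np


def mmse_estimate(y, w_s, mu_s, var_s, w_n, mu_n, var_n):
    """Illustrative MMSE estimate E[x | y] for y = x + n (an assumed model).

    x ~ Gaussian mixture (weights w_s, means mu_s, variances var_s)
    n ~ Gaussian mixture (weights w_n, means mu_n, variances var_n)
    All mixture parameters are assumed to be fitted beforehand, e.g. by EM.
    """
    w_s, mu_s, var_s = (np.asarray(a, dtype=float) for a in (w_s, mu_s, var_s))
    w_n, mu_n, var_n = (np.asarray(a, dtype=float) for a in (w_n, mu_n, var_n))
    y = np.asarray(y, dtype=float)[..., None, None]

    # Every (speech component k, noise component l) pair gives a Gaussian for y.
    mean_y = mu_s[:, None] + mu_n[None, :]
    var_y = var_s[:, None] + var_n[None, :]

    # Posterior probability of each pair given y, computed in the log domain
    # so that observations far from every component do not underflow to 0/0.
    log_post = (
        np.log(w_s[:, None] * w_n[None, :])
        - 0.5 * np.log(2.0 * np.pi * var_y)
        - 0.5 * (y - mean_y) ** 2 / var_y
    )
    log_post = log_post - log_post.max(axis=(-2, -1), keepdims=True)
    post = np.exp(log_post)
    post = post / post.sum(axis=(-2, -1), keepdims=True)

    # Per-pair conditional mean of x: prior mean plus a gain times the residual.
    gain = var_s[:, None] / var_y
    cond_mean = mu_s[:, None] + gain * (y - mean_y)

    # Closed-form MMSE estimate: posterior-weighted sum over all pairs.
    return (post * cond_mean).sum(axis=(-2, -1))


# Example call with made-up parameters: two speech components, one noise component.
estimate = mmse_estimate(
    y=[0.5, 3.0],
    w_s=[0.6, 0.4], mu_s=[0.2, 2.5], var_s=[0.05, 0.5],
    w_n=[1.0], mu_n=[0.3], var_n=[0.1],
)
```

In a front-end of the kind the abstract describes, an estimate like this would replace the noisy spectral magnitude before the mel filterbank and cepstral steps; how the paper actually parameterises and applies its estimator is not stated here.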
Related resources
Performance improvement of text-independent speaker verification systems based on histogram enhancement in noisy environments
In this paper a histogram enhancement technique is presented in order to improve the robustness of text-independent speaker verification systems. The technique transforms the features extracted from speech such that the contrast of their histogram is enhanced. Experiments showed significant improvements for this technique compared to standard techniques both in clean testing environments, and i...
Voice biometric feature using Gammatone filterbank and ICA
Voice biometric feature extraction is the core task in developing any speaker identification system. This paper proposes a robust feature extraction technique for the purpose of speaker identification. The technique is based on processing monaural speech signal using human auditory system based Gammatone Filterbank (GTF) and Independent Component Analysis (ICA). The measures used to assess the ...
Robust Support Vector Machines for Speaker Verification Task
An important step in speaker verification is extracting features that best characterize the speaker voice. This paper investigates a front-end processing that aims at improving the performance of speaker verification based on the SVMs classifier, in text independent mode. This approach combines features based on conventional Mel-cepstral Coefficients (MFCCs) and Line Spectral Frequencies (LSFs)...
Kernel-Based Probabilistic Neural Networks with Integrated Scoring Normalization for Speaker Verification
This paper investigates kernel-based probabilistic neural networks for speaker verification in clean and noisy environments. In particular, it compares the performance and characteristics of speaker verification systems that use probabilistic decision-based neural networks (PDBNNs), Gaussian mixture models (GMMs) and elliptical basis function networks (EBFNs) as speaker models. Experimental eva...
Text-dependent speaker verification under noisy conditions using parallel model combination
In real speaker verification applications, additive or convolutive noise creates a mismatch between training and recognition environments, degrading performance. Parallel Model Combination (PMC) is used successfully to improve the noise robustness of Hidden Markov Model (HMM) based speech recognisers [5]. This paper presents the results of applying PMC to compensate for additive noise in HMM-ba...
Journal title:
- I. J. Speech Technology
Volume 7, Issue
Pages -
Publication date: 2004